State of the Art Control of Atari Games Using Shallow Reinforcement Learning
نویسندگان
چکیده
The recently introduced Deep Q-Networks (DQN) algorithm has gained attention as one of the first successful combinations of deep neural networks and reinforcement learning. Its promise was demonstrated in the Arcade Learning Environment (ALE), a challenging framework composed of dozens of Atari 2600 games used to evaluate general competency in AI. It achieved dramatically better results than earlier approaches, showing that its ability to learn good representations is quite robust and general. This paper attempts to understand the principles that underlie DQN’s impressive performance and to better contextualize its success. We systematically evaluate the importance of key representational biases encoded by DQN’s network by proposing simple linear representations that make use of these concepts. Incorporating these characteristics, we obtain a computationally practical feature set that achieves competitive performance to DQN in the ALE. Besides offering insight into the strengths and weaknesses of DQN, we provide a generic representation for the ALE, significantly reducing the burden of learning a representation for each game. Moreover, we also provide a simple, reproducible benchmark for the sake of comparison to future work in the ALE.
منابع مشابه
Atari Games and Intel Processors
The asynchronous nature of the state-of-the-art reinforcement learning algorithms such as the Asynchronous Advantage ActorCritic algorithm, makes them exceptionally suitable for CPU computations. However, given the fact that deep reinforcement learning often deals with interpreting visual information, a large part of the train and inference time is spent performing convolutions. In this work we...
متن کاملA Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games
Reinforcement learning is concerned with learning to interact with environments that are initially unknown. State-of-the-art reinforcement learning approaches, such as DQN, are model-free and learn to act effectively across a wide range of environments such as Atari games, but require huge amounts of data. Modelbased techniques are more data-efficient, but need to acquire explicit knowledge abo...
متن کاملPlaying Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay
This paper introduces a novel method for learning how to play the most difficult Atari 2600 games from the Arcade Learning Environment using deep reinforcement learning. The proposed method, called human checkpoint replay, consists in using checkpoints sampled from human gameplay as starting points for the learning process. This is meant to compensate for the difficulties of current exploration...
متن کاملSigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning
In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon achieved near top-level human performance in backgammon, the deep reinforcement learning algorithm DQN achieved human-level performance in many Atari 2600 games. The purpose of this study is twofold. First, we propose two activation functions for...
متن کاملMean Actor Critic
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning. MAC is a policy gradient algorithm that uses the agent’s explicit representation of all action values to estimate the gradient of the policy, rather than using only the actions that were actually executed. This significantly reduces variance in the gradient updates and removes the n...
متن کامل